Skip to content

Make telemetry batch size configurable and add time-based flush #622

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 252 commits into
base: main
Choose a base branch
from

Conversation

saishreeeee
Copy link
Collaborator

@saishreeeee saishreeeee commented Jul 1, 2025

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • Other

Description

The flush timer is centralized in TelemetryClientFactory, single background thread to manage all connections. Keeping it in TelemetryClient would mean creating a timer thread per connection.

Used threading.Thread with threading.Event. The threading.Event acts as a thread-safe shutdown signal, and its wait(timeout) method allows the thread to wait for the next flush interval while remaining immediately responsive to a shutdown command.
While threading.Timer could be used, it would create a new thread every flush interval as we need to create a timer after each execution.

How is this tested?

  • Unit tests
  • E2E Tests
  • Manually
  • N/A

Related Tickets & Documents

design doc

Jesse and others added 30 commits June 24, 2022 12:47
* Remove databricks-specific test fixtures
* Introduce pytest, black, and mypy workflows
* Skip test_fetches_bench case
* Remove references to databricks-specific testing infra
The code for version <=1.0 is not contained in this repository. There has been no decision 
about whether or not to open source it.
* Add license and contributing sections to README.
* Add environment setup docs to CONTRIBUTING
* Clarify example of connection details in example
* Add badges from pypi
* Explicitly call out Python 3.7 or above is needed
Signed-off-by: Jesse Whitehouse <[email protected]>
Advise developers to use Python 3.7, 3.8, or 3.9 until #26 is fixed.
Includes a GitHub Action which checks for a valid sign-off on every proposed commit
* Isolate delay bounding logic
* Move error details scope up one-level.
* Retry GetOperationStatus if an OSError was raised during execution. Add retry_delay_default to use in this case.
* Log when a request is retried due to an OSError. Emit warnings for unexpected OSError codes
* Update docstring for make_request
* Nit: unit tests show the .warn message is deprecated. DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead

Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
* Test with multiple python versions.
* Update pyarrow to version 9.0.0 to address issue in relation to python 3.10 & a specific version of numpy being pulled in by pyarrow.

Closes #26 

Signed-off-by: David Black <[email protected]>
* Update changelog and bump to v2.0.4
* Specifically thank @dbaxa for this change.

Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
* Add test: cursors are closed when connection closes

Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Moe Derakhshani <[email protected]>
Signed-off-by: Moe Derakhshani <[email protected]>
Signed-off-by: Moe Derakhshani <[email protected]>

my [OAuth PR](https://github.com/databricks/databricks-sql-python/runs/8005844758?check_suite_focus=true) is blocked due to dco validation (following error):
<img width="1202" alt="Screen Shot 2022-08-25 at 12 05 40 PM" src="https://user-images.githubusercontent.com/22279672/186747897-c9d57586-366f-41f9-aa66-609f2bf3911f.png">



We should try to avoid running dco for internal databricks employees:
I am trying to relax the validation based on this guideline:
https://github.com/dcoapp/app/blob/main/README.md#skipping-sign-off-for-organization-members

and here:
https://stackoverflow.com/questions/62969381/is-it-in-line-with-the-dco-that-a-github-sign-off-needs-and-publishes-full-name
Signed-off-by: Moe Derakhshani <[email protected]>
Signed-off-by: Moe Derakhshani <[email protected]>
Signed-off-by: Moe Derakhshani <[email protected]>

this is undo of #42 till we figure out how to fix dco
This PR:
* Adds the foundation for OAuth against Databricks account on AWS with BYOIDP.
* It copies one internal module that Steve Weis @sweisdb wrote for Databricks CLI (oauth.py). Once ecosystem-dev team (Serge, Pieter) build a python sdk core we will move this code to their repo as a dependency. 
* the PR provides authenticators with visitor pattern format for stamping auth-token which later is intended to be moved to the repo owned by Serge @nfx and and Pieter @pietern
Signed-off-by: Jesse Whitehouse <[email protected]>
Bump to v2.1.0 and update changelog

Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Sai Shree Pradhan <[email protected]>
…TelemetryClientFactory

Signed-off-by: Sai Shree Pradhan <[email protected]>
Signed-off-by: Sai Shree Pradhan <[email protected]>
@saishreeeee saishreeeee changed the base branch from telemetry to main July 14, 2025 13:05
Copy link

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Signed-off-by: Sai Shree Pradhan <[email protected]>
Copy link

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@saishreeeee saishreeeee deployed to azure-prod July 14, 2025 15:29 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.